API Doc for Polars GPU Engine #16753
I am not fully convinced that it makes sense to show the basic queries and install instructions, rather than just linking to the polars docs.
Rationale: this leaves us with two places to update whenever anything changes.
.. code-block:: bash

    pip install polars[gpu] --extra-index-url=https://pypi.nvidia.com

GPU-based execution can be triggered by simply running ``.collect(engine="gpu")`` instead of ``.collect()``.

.. code-block:: python

    # Import the necessary library
    import polars as pl

    # Define the data for the LazyFrame
    ldf = pl.LazyFrame({
        "a": [1.242, 1.535],
    })

    print(ldf.select(pl.col("a").round(1)).collect(engine="gpu"))

For finer control, you can pass a ``GPUEngine`` object with additional configuration parameters to the ``engine=`` parameter.

.. code-block:: python

    # Import the necessary library
    import polars as pl

    # Define the data for the LazyFrame
    ldf = pl.LazyFrame({
        "a": [1.242, 1.535],
    })

    # Configure the GPU engine with advanced settings
    gpu_engine = pl.GPUEngine(
        device=0,
        raise_on_fail=True,  # Ensure the engine fails loudly if it cannot execute on the GPU
    )

    # Execute the collection with the custom GPU engine configuration
    print(ldf.select(pl.col("a").round(1)).collect(engine=gpu_engine))
This replicates (approximately) the information that we are maintaining on the polars site. I think the better approach is to not have that here, but to just immediately link there. Perhaps we can have some benchmark results on this landing page?
Removed the installation instructions and sample code snippet.
I am aligned with the flow. I will add the benchmarks to the page next week. See the latest flow in 093ce0c.
@singhmanas1 Can you write a proper title for this PR?
Speed ups experience with Polars GPU Engine
Speed up with Polars GPU Engine for an 80 GB dataset
Added the benchmarks: (1) query processing time versus dataset size; (2) per-query speedup for all 22 PDS-H queries
Thanks, LGTM
Why is there no CI being run here? I want to preview these docs...
/ok to test
A few change requests - the only blocker is the "TBD" link. Everything else can be fixed in a follow-up PR if needed.
Y axis label should be “Speedup (Polars CPU runtime / Polars GPU runtime)”
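To make the proposed axis label concrete, here is a minimal sketch of the ratio it describes. The query names and runtimes are made-up placeholder values, not benchmark measurements, and `speedup` is a hypothetical helper, not part of Polars or cuDF.

```python
# Placeholder runtimes in seconds (hypothetical values, NOT measurements).
cpu_runtimes = {"q1": 12.0, "q2": 3.0, "q3": 8.0}
gpu_runtimes = {"q1": 2.0, "q2": 1.5, "q3": 1.0}


def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Return CPU runtime / GPU runtime; a value above 1 means the GPU run was faster."""
    if gpu_seconds <= 0:
        raise ValueError("GPU runtime must be positive")
    return cpu_seconds / gpu_seconds


# Per-query speedup, i.e. the quantity plotted on the Y axis.
speedups = {q: speedup(cpu_runtimes[q], gpu_runtimes[q]) for q in cpu_runtimes}

for query, value in speedups.items():
    print(f"{query}: {value:.1f}x")
```

Putting the formula in the label tells readers that higher bars mean a larger GPU advantage and that a value of 1 means parity.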
   :width: 200px
   :target: https://colab.research.google.com/github/rapidsai-community/showcase/blob/main/accelerated_data_processing_examples/polars_gpu_engine_demo.ipynb

Take the cuDF backend for Polars for a test-drive in a free GPU-enabled notebook environment using your Google account by `launching on Colab <TBD>`_
Reminder to fix this before merging!
I think I fixed this. I assume it's supposed to point to https://colab.research.google.com/github/rapidsai-community/showcase/blob/main/accelerated_data_processing_examples/polars_gpu_engine_demo.ipynb. If that's incorrect please fix it.
One other change request: where do we link to this page? It needs to be linked from somewhere in the cuDF docs; it should not be an orphaned page. Maybe in https://github.com/rapidsai/cudf/blob/branch-24.10/docs/cudf/source/index.rst.
/ok to test
1. Updated benchmark with a graph of speedups on compute-heavy queries 2. Updated text description for the graph with compute-heavy queries
Minor edits to the language
Minor language edits
Minor language edits
Minor language edits
Added hardware configuration for the benchmark
Updated the hardware specs
…t/manas_polars_docs
/ok to test
/ok to test
Modified the cudf API docs to add a page on cudf pandas detailing - 1) How to use? 2) How to learn more? 3) How to try on Google Colab?